Byzantine Fault Tolerance
Author
Abstract
I am excited by the challenge of making distributed systems reliable and robust to failures. Distributed systems form the backbone of a variety of services that play an important part in daily life: email, e-commerce, and air traffic control are a few examples. The impact of failures of such services ranges from the inconvenience of lost email, to the hassles of delayed flights, to financial losses and even closure of companies. It is difficult to design reliable distributed systems because individual computers and the networks connecting them can fail in a variety of ways. Each computer or network failure can lead to unplanned behavior of individual components and of the system as a whole. Can we build reliable systems without considering every possible failure scenario? My research focuses on the development of end-to-end techniques for building reliable systems that are general, practical, and theoretically sound. Specifically, general techniques cover a wide range of faults and are easy to incorporate into a variety of new and legacy systems; practical techniques impose low overheads, provide robust performance in the presence of failures, and are based on realistic and tenable system models; theoretically sound techniques provide well-defined guarantees under well-defined assumptions. I believe that all three properties are important: techniques that are not general have limited deployment potential, techniques that are not practical will not be used, and techniques that are not theoretically sound may not work as advertised.
Similar resources
Tangaroa: a Byzantine Fault Tolerant Raft
We propose a Byzantine Fault Tolerant variant of the Raft consensus algorithm, BFT Raft, inspired by the original Raft[1] algorithm and the Practical Byzantine Fault Tolerance algorithm[2]. BFT Raft maintains the safety, fault tolerance, and liveness properties of Raft in the presence of Byzantine faults, while also aiming toward Raft’s goal of simplicity and understandability. We have imple...
Distributed Computing Column 39: Byzantine Generals: The Next Generation
The relevance of Byzantine fault tolerance in the context of cloud computing has been questioned[3]. While arguments against Byzantine fault tolerance seemingly make sense in the context of a single cloud, i.e., a large-scale cloud infrastructure that resides under control of a single, typically commercial provider, these arguments are less obvious in a broader context of the Int...
Byung-gon Chun
International Computer Science Institute, Berkeley, CA 2007 – Present Postdoctoral Researcher, Networking Group, working with Prof. Scott Shenker, Dr. Petros Maniatis, and Dr. Sylvia Ratnasamy Diverse replication for single-machine Byzantine-fault tolerance: Investigate exploiting cores in many-core systems to defend against software attacks. Explore different isolation and software diversity m...
A Robust Byzantine Fault-Tolerant Replication Technique for Peer-to-Peer Content Distribution
Problem statement: In peer-to-peer networks, Byzantine fault tolerance refers to the capability of a system to tolerate Byzantine faults. It can be achieved by replicating the server and by ensuring all server replicas reach an agreement on the input despite Byzantine faulty replicas and clients. Since malicious attacks and software errors can cause faulty nodes to exhibit Byzantine behavior, B...
Byzantine Fault Tolerance Can Be Fast
Byzantine fault tolerance is important because it can be used to implement highly-available systems that tolerate arbitrary behavior from faulty components. This paper presents a detailed performance evaluation of BFT, a state-machine replication algorithm that tolerates Byzantine faults in asynchronous systems. Our results contradict the common belief that Byzantine fault tolerance is too slow ...